HTML parsing for multiple input files using Java code [closed]
Posted by mkp on Programmers
Published on 2012-06-22T09:39:20Z
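import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import org.jsoup.Jsoup;

public class HtmlToText {
    public static void main(String[] args) throws IOException {
        // Read the HTML file line by line into one string.
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader("123.html"))) {
            String temp1;
            while ((temp1 = br.readLine()) != null) {
                sb.append(temp1).append('\n'); // keep line breaks so words do not run together
            }
        }
        // Note: Jsoup's text() normalizes whitespace, so the "\n" inserted here may come out as a space.
        String para = sb.toString().replaceAll("<br>", "\n");
        String textonly = Jsoup.parse(para).text();
        System.out.println(textonly);
        // Write the text with Windows line endings; the writer is closed automatically.
        try (FileWriter f1 = new FileWriter("123.txt")) {
            f1.write(textonly.replace("\n", "\r\n"));
        }
    }
}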
I have this code, but it handles only one file at a time. I want to process multiple files. I have 2000 files, named by number from 1 to 2000 (for example "1.html"), so I want to use a for loop such as for(i=1;i<=2000;i++), and after it runs a separate txt file should be generated for each input file.
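One way the loop described above might look, as a rough sketch rather than a definitive answer. The class name BatchHtmlToText, the method convertAll, and the htmlToText parameter are all hypothetical names made up for illustration. The sketch also assumes the files are UTF-8 and skips any number whose .html file is missing. The HTML-to-text step is passed in as a function so the loop does not depend on any library; with jsoup on the classpath it would be the same Jsoup.parse(...).text() call used in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

class BatchHtmlToText {

    // Converts <first>.html .. <last>.html inside dir into <n>.txt files.
    // Returns how many files were converted; missing inputs are skipped.
    static int convertAll(Path dir, int first, int last,
                          UnaryOperator<String> htmlToText) throws IOException {
        int converted = 0;
        for (int i = first; i <= last; i++) {
            Path in = dir.resolve(i + ".html");
            if (!Files.exists(in)) {
                continue; // assumption: gaps in the numbering are not an error
            }
            // Assumption: input and output are UTF-8.
            String html = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
            String text = htmlToText.apply(html);
            Path out = dir.resolve(i + ".txt");
            Files.write(out, text.getBytes(StandardCharsets.UTF_8));
            converted++;
        }
        return converted;
    }

    public static void main(String[] args) throws IOException {
        // Crude stand-in that only strips tags. With jsoup available, pass
        //   html -> org.jsoup.Jsoup.parse(html).text()
        // instead.
        UnaryOperator<String> stripTags = html -> html.replaceAll("<[^>]*>", "");
        int n = convertAll(Paths.get("."), 1, 2000, stripTags);
        System.out.println("Converted " + n + " files");
    }
}
```

Building each file name from the loop counter (i + ".html" and i + ".txt") is the only real change from the single-file version; everything else is the same read, convert, write sequence run once per file.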